A traffic signal and loop detector dataset of an urban intersection regulated by a fully actuated signal control system

Fully actuated signal controls are becoming increasingly popular in modern urban environments, attempting to reduce congestion locally, synchronize flows, or prioritize specific types of vehicles. This trend is expected to grow as more vehicles communicate via Vehicle-to-Infrastructure (V2I) communication. The presented dataset contains cleaned observations from a fully actuated signal control system with priority for public transportation. Time series data of traffic signals that regulate vehicle, public transportation, bicycle, and pedestrian traffic flows are available, showing when a traffic signal operates in a red or green phase. In addition, loop detector data representing the occupancy at several locations at an urban intersection in Zurich, Switzerland, is available. The data of all traffic signals and loop detectors covers January and February 2019 and has a resolution of 1 s. Recent advances in transportation science show novel approaches for signalized intersections, but most publications assess their methodology on self-collected or simulated data. Therefore, the presented dataset aims at facilitating the development, calibration, and validation of novel methodological developments for modeling, estimation, forecasting, and other tasks in traffic engineering. Furthermore, it can be used as a real-world benchmark dataset for objectively comparing different methodologies.

© 2023 The Author(s

Value of the Data
• Due to the scarcity of high-resolution traffic signal and loop detector data from an actuated traffic signal control, this dataset is significant to support the design of methodologies capturing non-linear spatial-temporal traffic relationships.
• Researchers and data scientists can benefit from the dataset when developing and testing estimation and prediction models, e.g., traffic flow, travel times, or signal phase timings.
• The dataset can also be useful for benchmarking to allow for a comparative analysis of various data-driven methodologies at urban intersections.

Data Description
The dataset consists of four comma-separated-value (CSV) files with 15 days of consecutive time series data each. The corresponding names are 'intersection_data_set_jan_01_15.csv', 'intersection_data_set_jan_16_30.csv', 'intersection_data_set_feb_01_15.csv', and 'intersection_data_set_feb_16_28.csv'. Note that the second dataset for February only contains 12 days. In total, two months of data are available for January and February 2019. The data includes the device states for all traffic lights and loop detectors (LD) at an urban intersection. Every data sample (a row entry in the dataset) corresponds to one observation. The provided data resolution is 1 s. Each CSV file provides the same data structure, defined as follows:
• Time: Time of the recorded sample containing year, month, day, and timestamp in the format YYYY-MM-DD HH:MM:SS.
• sg1-sg12: Time series of traffic signals 1-12 with corresponding states 0 or 1.
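As an illustration of the file structure described above, the following minimal Python sketch reads rows of this shape and checks that consecutive samples are 1 s apart. It relies only on the 'Time' column and the timestamp format given in the text. The helper names `read_rows` and `is_evenly_spaced` and the three-line in-memory sample are hypothetical and are not part of the dataset or its tooling.

```python
import csv
import io
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches YYYY-MM-DD HH:MM:SS

def read_rows(handle):
    """Read (timestamp, {device: state}) pairs from an open CSV handle."""
    rows = []
    for record in csv.DictReader(handle):
        timestamp = datetime.strptime(record["Time"], TIME_FORMAT)
        # Every column other than 'Time' is treated as a 0/1 device state.
        states = {name: int(value) for name, value in record.items() if name != "Time"}
        rows.append((timestamp, states))
    return rows

def is_evenly_spaced(rows, step_s=1):
    """True if all consecutive samples are exactly step_s seconds apart."""
    return all(
        (later[0] - earlier[0]).total_seconds() == step_s
        for earlier, later in zip(rows, rows[1:])
    )

# Made-up sample with two signal columns; real files contain more columns.
sample = io.StringIO(
    "Time,sg1,sg2\n"
    "2019-01-01 00:00:00,0,1\n"
    "2019-01-01 00:00:01,0,1\n"
    "2019-01-01 00:00:02,1,0\n"
)
rows = read_rows(sample)
print(len(rows), is_evenly_spaced(rows))
```

To apply the sketch to one of the published files, the in-memory sample would be replaced by a handle obtained with `open(path, newline="")`, where the path is wherever the file was downloaded to.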
Every traffic signal (SG1 to SG12) in the CSV files is represented as a time series with two states. The state 0 indicates that a traffic signal operates in a red phase. State 1 describes a green phase when vehicles/pedestrians are allowed to cross the intersection. Fig. 1 depicts a sample signal of a traffic light (in blue) with two red phases and one green phase (highlighted by the red/green colored areas, respectively). In Switzerland, the beginning and end of a green phase are indicated with a short red/yellow and yellow phase, respectively. As these phases have constant durations (1 s and 3 s, respectively) and federal legislation does not allow vehicles to cross the intersection during them, they are assigned to the red phase [4]. All LDs in the dataset are also provided as time series with the states 0 and 1. The state 0 represents an LD that is not occupied, i.e., no vehicle currently passes the location where the LD is implemented. On the other hand, if an LD shows a pulse, meaning that the signal state is 1, a vehicle occupies the LD. Fig. 2 depicts a sample signal of an LD with six detections and the corresponding signal states.
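The two-state encoding lends itself to a simple run-length view. The sketch below, which is an illustration and not the authors' code, collapses a 0/1 series sampled at 1 s into runs, from which green phases and LD pulses can be read off. The function names and the two short example series are made up for this purpose. Note that at a 1 s resolution one pulse does not necessarily correspond to exactly one vehicle.

```python
from itertools import groupby

def run_lengths(states):
    """Collapse a 0/1 series into (state, start_index, duration_s) runs."""
    runs = []
    start = 0
    for state, group in groupby(states):
        length = sum(1 for _ in group)
        runs.append((state, start, length))
        start += length
    return runs

def green_phases(signal_states):
    """Return (start_index, duration_s) for every run of state 1 (green)."""
    return [(start, length) for state, start, length in run_lengths(signal_states) if state == 1]

def count_pulses(detector_states):
    """Count maximal runs of state 1 (occupied) in a loop detector series."""
    return sum(1 for state, _, _ in run_lengths(detector_states) if state == 1)

# Made-up illustrative series, not taken from the dataset.
sg_example = [0, 0, 0, 1, 1, 1, 1, 0, 0]
ld_example = [0, 1, 0, 0, 1, 1, 0, 1, 0]

print(green_phases(sg_example))
print(count_pulses(ld_example))
```

Because the samples are 1 s apart, the index of a sample equals its offset in seconds from the start of the series, and a run length equals a duration in seconds.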

Experimental Design, Materials, and Methods
The collected dataset represents traffic signal and LD data from a four-leg intersection in Zurich, Switzerland. The intersection, depicted in Fig. 3, is regulated by a fully actuated signal control system, meaning that red, green, and cycle times are non-constant [5, 6]. These times vary because public transportation is prioritized and because signal phases are extended or shortened based on the traffic demand of each intersection approach [7]. The speed limit for the intersection's northbound and southbound approaches is 30 km/h. For approaches from west and east, the speed limit is 50 km/h. In Fig. 3, the traffic signals are shown with circled numbers. There are 12 traffic signals installed at the stop lines. Signals 1, 2, 4, 5, and 6 control vehicular traffic streams and follow a red-green signalization pattern. Traffic signal 3 regulates only bicycle flows that can travel straight to the south. Pedestrian flows are regulated by signals 7-10. From north to south and vice versa, multiple tram lines frequently operate between 7:00 and 22:00. The tram tracks are indicated by dashed lines and overlap with car lanes for the southbound approach. The signal control potentially prioritizes public transportation, and signals 11 and 12 represent the designated traffic lights for trams. This subset of traffic lights only operates in a green phase when trams arrive at the intersection.
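Since cycle times are non-constant under actuated control, a user may want to measure them per signal. The following sketch assumes one common convention, namely that a cycle is the time between two consecutive green onsets of the same signal. This convention, the function names, and the example series are assumptions made for illustration and are not taken from the dataset documentation. The measure is of limited use for the tram signals 11 and 12, which only turn green when a tram arrives.

```python
def green_starts(states):
    """Indices (in seconds) where a signal switches from red (0) to green (1)."""
    return [i for i in range(1, len(states)) if states[i - 1] == 0 and states[i] == 1]

def cycle_times(states):
    """Seconds between consecutive green onsets of one signal."""
    starts = green_starts(states)
    return [later - earlier for earlier, later in zip(starts, starts[1:])]

# Made-up series: red 5 s, green 10 s, red 20 s, green 15 s, red 30 s, green 5 s.
example = [0] * 5 + [1] * 10 + [0] * 20 + [1] * 15 + [0] * 30 + [1] * 5

print(green_starts(example))
print(cycle_times(example))
```

A green phase that is already running at the first sample is not counted as an onset, because its true start lies outside the observed window.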
The city of Zurich operates the signal control system of the intersection. A centralized system provides log files of the control system that contain event-based telegrams (i.e., records). Every telegram contains the time of the event, an identifier of the control system, the identifier of the device that triggered the event, and the new device state. For example, if traffic signal 1 changes from a red phase to a green phase, the telegram contains a device-id of 'sg1' and the new state of 1. The log files contain every new event as a one-line string entry. Algorithm 1 contains pseudocode for parsing log files into a tabular dataset. A given list of log files is processed line by line, and the device state information is stored in a new data structure. The implementation is available from https://github.com/TrafficControlDataset/data-preparation-suite. Note that the log details are not included in the implementation as the information is not public. For the compilation of the presented dataset, regular expressions are designed to automatically extract the timestamps, device identifiers, and the corresponding states. The telegrams are event-based, i.e., a telegram is sent only when a device changes its state. Therefore, the time axis is unevenly spaced. We construct a consecutive time axis (resolution of 1 s) and impute the tracked states so that the dataset provides an evenly spaced time series. The procedure is applied to all available log files and all available devices implemented at the intersection.
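The parsing and resampling steps described above can be sketched as follows. Because the real log format is not public, the semicolon-separated telegram layout, the regular expression, and the function names below are hypothetical. The sketch also assumes that imputation means carrying the last known state forward until the next event, which is one plausible reading of the text and not a confirmed detail of the authors' implementation.

```python
import re
from datetime import datetime, timedelta

# Hypothetical telegram layout: "<time>;<controller>;<device>;<state>".
TELEGRAM = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2});"
    r"(?P<controller>[^;]+);(?P<device>[^;]+);(?P<state>[01])$"
)

def parse_telegrams(lines):
    """Extract (timestamp, device, state) events, sorted by time."""
    events = []
    for line in lines:
        match = TELEGRAM.match(line.strip())
        if match is None:
            continue  # skip lines that do not look like telegrams
        timestamp = datetime.strptime(match["time"], "%Y-%m-%d %H:%M:%S")
        events.append((timestamp, match["device"], int(match["state"])))
    events.sort(key=lambda event: event[0])
    return events

def to_regular_series(events, start, end):
    """Carry the last known state forward on a 1 s grid from start to end."""
    current = {device: None for _, device, _ in events}  # None = not yet observed
    rows = []
    index = 0
    t = start
    while t <= end:
        # Apply every event that happened at or before the current grid time.
        while index < len(events) and events[index][0] <= t:
            _, device, state = events[index]
            current[device] = state
            index += 1
        rows.append((t, dict(current)))
        t += timedelta(seconds=1)
    return rows

# Made-up log lines in the hypothetical layout.
log = [
    "2019-01-01 00:00:00;ctrl;sg1;0",
    "2019-01-01 00:00:02;ctrl;sg1;1",
    "2019-01-01 00:00:03;ctrl;ld1;1",
    "not a telegram",
    "2019-01-01 00:00:04;ctrl;ld1;0",
]
events = parse_telegrams(log)
rows = to_regular_series(events, datetime(2019, 1, 1, 0, 0, 0), datetime(2019, 1, 1, 0, 0, 5))
print([states["sg1"] for _, states in rows])
print([states["ld1"] for _, states in rows])
```

In this sketch a device has the state `None` until its first telegram is seen. How the published dataset treats the interval before a device's first event is not stated in the text.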
In the following, details about the location and traffic flows regulated by traffic lights 1-12 are presented. For all available traffic lights, Table 1 lists the location, the controlled transport mode, and the traffic flow characteristics (direction of traffic arrival and departure for each traffic signal). For the LDs available in the dataset, Table 2 lists the device name, the location, the detected transport modes, and the traffic light that regulates the detected traffic stream. Note that the location is measured (a) from the stop line if a detector is implemented upstream of a traffic light, and (b) from the pedestrian crossing if implemented downstream.
No separate detectors are implemented for traffic signals 1, 3, and 7-9. Hence, no information on arriving vehicles, cyclists, or pedestrians is available for these signals. Note that the LDs in this dataset do not allow for the identification of the vehicle type. LDs 1-3 provide only public transportation (PT) detections, as the sensors are implemented on a dedicated lane for PT.

Ethics Statements
The authors declare that this work does not involve the use of human subjects, social media data, or experimentation with animals.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
A traffic signal and loop detector dataset of an urban intersection regulated by a fully actuated signal control system (Original data) (ETH Research Collection).

Acknowledgments
M. Menendez acknowledges the support of the NYUAD Center for Interacting Urban Networks (CITIES), funded by Tamkeen under the NYUAD Research Institute Award CG001.